Generalised PP-attachment Disambiguation Using Corpus-based Linguistic Diagnostics
نویسنده
چکیده
We propose a new formulation of the PP attachment problem as a 4-way classification which takes into account the argument or adjunct status of the PP. Based on linguistic diagnostics, we train a 4-way classifier that reaches an average accuracy of 73.9% (baseline 66.2%). Compared to a sequence of binary classifiers, the 4-way classifier reaches better performance and individuates a verb's arguments more accurately, thus improving the acquisition of a crucial piece of information for many NLP applications.
منابع مشابه
Disambiguation of English PP Attachment using Multilingual Aligned Data
Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguist...
متن کاملImproving PP Attachment Disambiguation in a Rule-based Parser
This paper deals with how to enhance the performance of a rule-based parser using statistical Information. PP (Prepositional Phrase) attachment ambiguity is one of the main ambiguities found in parsing. We therefore conducted some experiments on extracting statistical information for PP attachment from a corpus, and on applying such information to a rule-based parser. Two types of information a...
متن کاملPP-Attachment Disambiguation Boosted by a Gigantic Volume of Unambiguous Examples
We present a PP-attachment disambiguation method based on a gigantic volume of unambiguous examples extracted from raw corpus. The unambiguous examples are utilized to acquire precise lexical preferences for PP-attachment disambiguation. Attachment decisions are made by a machine learning method that optimizes the use of the lexical preferences. Our experiments indicate that the precise lexical...
متن کاملCorpus Based PP Attachment Ambiguity Resolution with a Semantic Dictionary
This paper deals with two important ambiguities of natural language: prepositional phrase attachment and word sense ambiguity. We propose a new supervised learning method for PPattachment based on a semantically tagged corpus. Because any sufficiently big sense-tagged corpus does not exist, we also propose a new unsupervised context based word sense disambiguation algorithm which amends the tra...
متن کاملAcquiring Selectional Preferences from Untagged Text for Prepositional Phrase Attachment Disambiguation
Extracting information automatically from texts for database representation requires previously well-grouped phrases so that entities can be separated adequately. This problem is known as prepositional phrase (PP) attachment disambiguation. Current PP attachment disambiguation systems require an annotated treebank or they use an Internet connection to achieve a precision of more than 90%. Unfor...
متن کامل